
    A discriminative prototype selection approach for graph embedding in human action recognition

    This paper proposes a novel graph-based method for representing a human's shape during the performance of an action. Despite their strong representational power, graphs are computationally cumbersome for pattern analysis. One way of circumventing this problem is to transform the graphs into a vector space by means of graph embedding. Such an embedding can be conveniently obtained by way of a set of prototype graphs and a dissimilarity measure; yet the critical step in this approach is the selection of a suitable set of prototypes which can capture both the salient structure within each action class and the intra-class variation. This paper proposes a new discriminative approach for the selection of prototypes which maximizes a function of the inter- and intra-class distances. Experiments on an action recognition dataset reported in the paper show that such a discriminative approach outperforms well-established prototype selection methods such as center, border, and random prototype selection. © 2011 IEEE
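    To illustrate the dissimilarity-space idea, the following is a minimal numpy sketch, not the authors' exact objective: given a precomputed graph dissimilarity matrix D, prototypes are picked greedily to maximize a ratio of inter- to intra-class distances in the resulting embedding.

```python
import numpy as np

def select_prototypes(D, labels, k):
    """Greedily pick k prototype indices that maximize the ratio of mean
    inter-class to mean intra-class distance in the embedded space."""
    n = D.shape[0]
    chosen = []
    for _ in range(k):
        best, best_score = None, -np.inf
        for i in range(n):
            if i in chosen:
                continue
            E = D[:, chosen + [i]]  # dissimilarity-space embedding
            intra = np.mean([np.linalg.norm(E[a] - E[b])
                             for a in range(n) for b in range(a + 1, n)
                             if labels[a] == labels[b]])
            inter = np.mean([np.linalg.norm(E[a] - E[b])
                             for a in range(n) for b in range(a + 1, n)
                             if labels[a] != labels[b]])
            score = inter / (intra + 1e-9)
            if score > best_score:
                best, best_score = i, score
        chosen.append(best)
    return chosen

def embed(D, prototypes):
    """Map each object to the vector of its distances to the prototypes."""
    return D[:, prototypes]
```

Any graph dissimilarity (e.g. an approximate graph edit distance) can supply D; the selection itself then runs purely on the matrix.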

    PersoNER: Persian named-entity recognition

    © 1963-2018 ACL. Named-Entity Recognition (NER) is still a challenging task for languages with low digital resources. The main difficulties arise from the scarcity of annotated corpora and the consequent problematic training of an effective NER pipeline. To bridge this gap, in this paper we target the Persian language, which is spoken by a population of over a hundred million people worldwide. We first present and provide ArmanPersoNERCorpus, the first manually-annotated Persian NER corpus. Then, we introduce PersoNER, an NER pipeline for Persian that leverages a word embedding and a sequential max-margin classifier. The experimental results show that the proposed approach is capable of achieving interesting MUC7 and CoNLL scores while outperforming two alternatives based on a CRF and a recurrent neural network.
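    The CoNLL-style scores mentioned above evaluate whole entities rather than individual tags. A small illustrative Python sketch of entity-level F1 over BIO tag sequences (the exact matching scheme is an assumption here):

```python
def bio_spans(tags):
    """Extract (type, start, end) entity spans from a BIO tag sequence."""
    spans, start, etype = set(), None, None
    for i, tag in enumerate(tags + ["O"]):   # sentinel closes a trailing span
        inside_same = tag.startswith("I-") and tag[2:] == etype
        if start is not None and not inside_same:
            spans.add((etype, start, i))     # close the open span
            start, etype = None, None
        if start is None and tag != "O":
            start, etype = i, tag[2:]        # open a new span (B- or stray I-)
    return spans

def entity_f1(gold, pred):
    """Entity-level F1: a predicted span counts only if type and boundaries match."""
    g, p = bio_spans(gold), bio_spans(pred)
    tp = len(g & p)
    prec = tp / len(p) if p else 0.0
    rec = tp / len(g) if g else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0
```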

    Regressing Word and Sentence Embeddings for Regularization of Neural Machine Translation

    In recent years, neural machine translation (NMT) has become the dominant approach in automated translation. However, like many other deep learning approaches, NMT suffers from overfitting when the amount of training data is limited. This is a serious issue for low-resource language pairs and many specialized translation domains that are inherently limited in the amount of available supervised data. For this reason, in this paper we propose regressing word (ReWE) and sentence (ReSE) embeddings at training time as a way to regularize NMT models and improve their generalization. During training, our models are trained to jointly predict categorical (words in the vocabulary) and continuous (word and sentence embeddings) outputs. An extensive set of experiments over four language pairs of variable training set size has shown that ReWE and ReSE can outperform strong state-of-the-art baseline models, with an improvement that is larger for smaller training sets (e.g., up to +5.15 BLEU points in Basque-English translation). Visualizations of the decoder's output space show that the proposed regularizers improve the clustering of unique words, facilitating correct predictions. In a final experiment on unsupervised NMT, we show that ReWE and ReSE are also able to improve the quality of machine translation when no parallel data are available.
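    The joint categorical/continuous objective can be sketched for a single decoder step as follows. This toy numpy example assumes a linear projection W_rewe and a cosine-based regression term; both are illustrative choices, not the paper's exact architecture.

```python
import numpy as np

def joint_loss(logits, target_id, target_emb, dec_state, W_rewe, lam=0.2):
    """One-step ReWE-style loss: cross-entropy on the target word plus a
    regression loss pulling a predicted embedding toward the target embedding."""
    # categorical term: softmax cross-entropy over the vocabulary
    z = logits - logits.max()
    log_probs = z - np.log(np.exp(z).sum())
    ce = -log_probs[target_id]
    # continuous term: regress the target word embedding from the decoder state
    pred_emb = W_rewe @ dec_state
    cos = pred_emb @ target_emb / (
        np.linalg.norm(pred_emb) * np.linalg.norm(target_emb) + 1e-9)
    return ce + lam * (1.0 - cos)
```

At inference time only the categorical head is used; the regression head acts purely as a training-time regularizer.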

    ReWE: Regressing word embeddings for regularization of neural machine translation systems

    Regularization of neural machine translation is still a significant problem, especially in low-resource settings. To mitigate this problem, we propose regressing word embeddings (ReWE) as a new regularization technique in a system that is jointly trained to predict the next word in the translation (categorical value) and its word embedding (continuous value). Such joint training allows the proposed system to learn the distributional properties represented by the word embeddings, empirically improving the generalization to unseen sentences. Experiments over three translation datasets have shown a consistent improvement over a strong baseline, ranging between 0.91 and 2.54 BLEU points, and also a marked improvement over a state-of-the-art system.

    Prototype generation on structural data using dissimilarity space representation

    Data reduction techniques play a key role in instance-based classification to lower the amount of data to be processed. Among the different existing approaches, prototype selection (PS) and prototype generation (PG) are the most representative ones. These two families differ in the way the reduced set is obtained from the initial one: while the former aims at selecting the most representative elements from the set, the latter creates new data out of it. Although PG is considered to delimit decision boundaries more efficiently, the operations required are not so well defined in scenarios involving structural data such as strings, trees, or graphs. This work studies the possibility of using dissimilarity space (DS) methods as an intermediate process for mapping the initial structural representation to a statistical one, thereby allowing the use of PG methods. A comparative experiment over string data is carried out in which our proposal is compared against PS methods in the original space. Results show that the proposed strategy achieves results statistically comparable to PS in the initial space, thus standing as a clear alternative to the classic approach, with some additional advantages derived from the DS representation. This work was partially supported by the Spanish Ministerio de Educación, Cultura y Deporte through an FPU fellowship (AP2012-0939), Vicerrectorado de Investigación, Desarrollo e Innovación de la Universidad de Alicante through the FPU program (UAFPU2014-5883), and the Spanish Ministerio de Economía y Competitividad through Project TIMuL (No. TIN2013-48152-C2-1-R, supported by EU FEDER funds).
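    A minimal Python sketch of the dissimilarity-space mapping for strings (the choice of edit distance and of the prototype set is illustrative): each string becomes the vector of its distances to a few prototype strings, after which any statistical PG method, such as k-means centroid generation, can operate on the vectors.

```python
import numpy as np

def edit_distance(a, b):
    """Classic Levenshtein distance via single-row dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            # dp[j]: deletion, dp[j-1]: insertion, prev: substitution/match
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

def to_dissimilarity_space(strings, prototypes):
    """Map each string to its vector of edit distances to the prototypes."""
    return np.array([[edit_distance(s, p) for p in prototypes] for s in strings])
```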

    Sequential Labeling with Structural SVM under Nondecomposable Losses

    © 2012 IEEE. Sequential labeling addresses the classification of sequential data, which are widespread in fields as diverse as computer vision, finance, and genomics. The model traditionally used for sequential labeling is the hidden Markov model (HMM), where the sequence of class labels to be predicted is encoded as a Markov chain. In recent years, HMMs have benefited from minimum-loss training approaches, such as the structural support vector machine (SSVM), which, in many cases, has reported higher classification accuracy. However, the loss functions available for training are restricted to decomposable cases, such as the 0-1 loss and the Hamming loss. In many practical cases, other loss functions, such as those based on the F1 measure, the precision/recall break-even point, and the average precision (AP), can describe desirable performance more effectively. For this reason, in this paper we propose a training algorithm for the SSVM that can minimize any loss based on the classification contingency table, and we present a training algorithm that minimizes an AP loss. Experimental results over a set of diverse and challenging datasets (TUM Kitchen, CMU Multimodal Activity, and Ozone Level Detection) show that the proposed training algorithms achieve significant improvements in the F1 measure and AP compared with the conventional SSVM, and their performance is in line with or above that of other state-of-the-art sequential labeling approaches.
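    For intuition, a contingency-table loss such as the F1-based one does not decompose into a sum over individual labels, unlike the Hamming loss. A minimal Python sketch of the binary case, for illustration only:

```python
def f1_loss(y_true, y_pred, positive=1):
    """Loss derived from the classification contingency table: 1 - F1.
    It depends jointly on all labels, so it is nondecomposable."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if 2 * tp + fp + fn == 0:
        return 0.0  # no positives anywhere: treat as perfect
    return 1.0 - 2 * tp / (2 * tp + fp + fn)
```

Training an SSVM with such a loss requires loss-augmented inference over whole sequences, which is the problem the paper's algorithm addresses.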

    BiLSTM-CRF for Persian named-entity recognition; ArmanPersoNERCorpus: the first entity-annotated Persian dataset

    © LREC 2018 - 11th International Conference on Language Resources and Evaluation. All rights reserved. Named-entity recognition (NER) can still be regarded as work in progress for a number of Asian languages due to the scarcity of annotated corpora. For this reason, with this paper we publicly release an entity-annotated Persian dataset and we present an effective approach for Persian NER based on a deep learning architecture. In addition to the entity-annotated dataset, we release a number of word embeddings (including GloVe, skip-gram, CBOW and Hellinger PCA) trained on a sizable collection of Persian text. The combination of the deep learning architecture (a BiLSTM-CRF) and the pre-trained word embeddings has allowed us to achieve a 77.45% CoNLL F1 score, a result that is more than 12 percentage points higher than the best previous result and interesting in absolute terms.
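    For intuition, the CRF layer on top of the BiLSTM scores a whole tag sequence as the sum of per-token emission scores and tag-to-tag transition scores. A minimal numpy sketch (shapes and names are illustrative; in a real BiLSTM-CRF the emissions come from the BiLSTM outputs):

```python
import numpy as np

def crf_score(emissions, transitions, tags):
    """Score of one tag sequence under a linear-chain CRF layer:
    emissions is (T, L) per-token tag scores, transitions is (L, L)."""
    score = emissions[0, tags[0]]
    for t in range(1, len(tags)):
        score += transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    return score
```

Training maximizes this score against the log-sum of all sequence scores, which lets the model learn constraints such as "I-PER cannot follow B-LOC" through the transition matrix.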

    English-Basque statistical and neural machine translation

    © LREC 2018 - 11th International Conference on Language Resources and Evaluation. All rights reserved. Neural Machine Translation (NMT) has attracted increasing attention in recent years. However, it tends to require very large training corpora, which can prove problematic for languages with low resources. For this reason, Statistical Machine Translation (SMT) continues to be a popular approach for low-resource language pairs. In this work, we address English-Basque translation and compare the performance of three contemporary statistical and neural machine translation systems: OpenNMT, Moses SMT and Google Translate. For evaluation, we employ an open-domain and an IT-domain corpus from the WMT16 resources for machine translation. In addition, we release a small dataset (Berriak) of 500 highly accurate English-Basque translations of complex sentences, useful for a thorough testing of the translation systems.

    Complex event recognition by latent temporal models of concepts

    © 2014 IEEE. Complex event recognition is an expanding research area aiming to recognize entities of high-level semantics in videos. Typical approaches exploit the so-called 'bags' of spatiotemporal features such as STIP, ISA and DTF-HOG; yet, more recently, the notion of concept has emerged as an alternative, intermediate representation with greater descriptive power, and 'bags of concepts' have been used for recognition. In this paper we argue that the concepts in an event tend to articulate over a discernible temporal structure, and we exploit a temporal model using the scores of concept detectors as measurements. In addition, we propose several heuristics to improve the initialization of the model's latent states and take advantage of the time-sparsity of the concepts. Experimental results on videos from the challenging TRECVID MED 2012 dataset show that the proposed approach achieves an improvement in average precision of 8.92% over comparable bags of concepts, thus validating the use of temporal structure over concepts for complex event recognition.
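    A latent temporal model of this kind can be evaluated with the standard HMM forward recursion. A minimal log-space numpy sketch, where the per-frame emission scores would be derived from the concept-detector outputs (how they are derived is left abstract here):

```python
import numpy as np

def forward(log_pi, log_A, log_emit):
    """Forward algorithm in log space: log_pi (S,) initial latent-state
    log-probabilities, log_A (S, S) state transitions, log_emit (T, S)
    per-frame emission scores. Returns the sequence log-likelihood."""
    alpha = log_pi + log_emit[0]
    for t in range(1, log_emit.shape[0]):
        # entry (i, j) of the broadcast sum is alpha_i + log A_ij
        alpha = log_emit[t] + np.logaddexp.reduce(alpha[:, None] + log_A, axis=0)
    return np.logaddexp.reduce(alpha)
```

Scoring a video against per-event models of this form, then ranking by log-likelihood, is one plausible way to use the temporal structure for recognition.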

    Joint action segmentation and classification by an extended hidden markov model

    Hidden Markov models (HMMs) provide joint segmentation and classification of sequential data through efficient inference algorithms and have therefore been employed in fields as diverse as speech recognition, document processing, and genomics. However, conventional HMMs do not suit action segmentation in video due to the nature of the measurements, which are often irregular in space and time, high-dimensional and affected by outliers. For this reason, in this paper we present a joint action segmentation and classification approach based on an extended model: the hidden Markov model for multiple, irregular observations (HMM-MIO). Experiments performed over a concatenated version of the popular KTH action dataset and the challenging CMU multi-modal activity dataset (CMU-MMAC) report accuracies comparable to or higher than those of a bag-of-features approach, showing the usefulness of improved sequential models for joint action segmentation and classification tasks. © 1994-2012 IEEE
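    The joint segmentation and classification an HMM provides can be sketched as Viterbi decoding followed by collapsing runs of equal states into labeled segments. A minimal numpy illustration of that generic mechanism, not the HMM-MIO model itself:

```python
import numpy as np

def viterbi_segments(log_pi, log_A, log_emit):
    """Decode the most likely state path, then collapse runs of equal states
    into (label, start, end) segments: segmentation and classification in one pass."""
    T, S = log_emit.shape
    dp = log_pi + log_emit[0]
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = dp[:, None] + log_A          # (i, j): dp_i + log A_ij
        back[t] = np.argmax(scores, axis=0)
        dp = log_emit[t] + np.max(scores, axis=0)
    path = [int(np.argmax(dp))]
    for t in range(T - 1, 0, -1):             # backtrack the best path
        path.append(int(back[t, path[-1]]))
    path.reverse()
    segments, start = [], 0
    for t in range(1, T + 1):                 # collapse equal-state runs
        if t == T or path[t] != path[start]:
            segments.append((path[start], start, t))
            start = t
    return segments
```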